Automatic detection of deceptive opinions using automatically identified specific details
Abstract
Distinguishing deceptive opinions (that is, fabricated views disguised as genuine) from honest opinions is a hard problem. Deceptive opinions include the false expression of a controversial opinion, a misleading review of an item or service bought online, and deceitful interviews. Unlike many language tasks, detecting deceptive opinions from text alone is quite difficult for humans. Ott et al. (2011) demonstrated this by asking a group of students to judge whether 40 online reviews were truthful or deceptive. These reviews were drawn from their Deceptive Opinion Spam Corpus, introduced in the same paper, and so included an equal number of truthful and deceptive reviews. The students performed at roughly 60% accuracy, only slightly better than the 50% chance baseline of picking one of two labels (truthful or deceptive). Notably, the students were psychologically biased towards judging opinions as truthful rather than deceptive. This poor performance suggests that detecting deceptive opinions is a complex task that can benefit greatly from unbiased computational analysis. Much of the research on deceptive opinions has used online reviews from the Ott et al. (2011) corpus as a benchmark because these data are a rich source of opinion spam, a type of deceptive opinion. As e-commerce burgeons, online reviews are becoming increasingly important to company reputations and consumer product assessment. Because reviews strongly influence potential customers, deceptive opinion spam is being produced to mislead them. Whether the opinion spam is sponsored by a company wanting to promote its services (positive opinion spam) or a business maligning its rival with false claims (negative opinion spam), accurate ...
Similar resources
Using linguistically-defined specific details to detect deception across domains
Current automatic deception detection approaches tend to rely on cues that are based either on specific lexical items or on linguistically abstract features that are not necessarily motivated by the psychology of deception. Notably, while approaches relying on such features can do well when the content domain is similar for training and testing, they suffer when content changes occur. We invest...
Voting for Deceptive Opinion Spam Detection
Consumers’ purchase decisions are increasingly influenced by user-generated online reviews. Accordingly, there has been growing concern about the potential for posting deceptive opinion spam: fictitious reviews that have been deliberately written to sound authentic in order to deceive readers. Existing approaches mainly focus on developing automatic supervised-learning-based methods to help users id...
Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text of tweets via one of the machine l...
Collusion Set Detection Through Outlier Discovery
Digging in the details: a case study in network data mining; Efficient identification of overlapping communities; Event-driven document selection for terrorism information extraction; Link analysis tools for intelligence and counterterrorism; Mining candidate viruses as potential bio-terrorism weapons from biomedical literature; Private mining of association rules ...
Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection
When automatically detecting deception, it is important to model individual differences across speakers. We explore the automatic identification of individual traits such as gender, native language, and personality, using acoustic-prosodic and lexical features from an initial non-deceptive dialogue. We also explore predicting success at deception and at deception detection, using the same featu...